Abstract

Robustness is a critical measure of the resilience of large networked
systems, such as transportation and communication networks. Most prior
work focuses on the global robustness of a given graph at large, e.g.,
by measuring its overall vulnerability to external attacks or random
failures. In this paper, we turn attention to local robustness and pose
a novel problem along the lines of subgraph mining: given a large graph,
how can we find its most robust local subgraph (RLS)?

We define a robust subgraph as a subset of nodes with
high communicability [13] among them, and formulate the
RLS-Problem of finding a subgraph of given size with
maximum robustness in the host graph. Our formulation
is related to the recently proposed general framework [39]
for the densest subgraph problem, but differs from it
substantially in that, besides the number of edges in the
subgraph, robustness also depends on the placement of
edges, i.e., the subgraph topology. We show that the
RLS-Problem is NP-hard and propose two heuristic algorithms
based on top-down and bottom-up search strategies. Further,
we present modifications of our algorithms to handle three
practical variants of the RLS-Problem. Experiments on
synthetic and real-world graphs demonstrate that we find
subgraphs with larger robustness than the densest subgraphs
[8, 39] even at lower densities, suggesting that the existing
approaches are not suitable for the new problem setting.

1 Introduction 

Complex networked systems, such as the Internet, road 
networks, communication networks, the power grid, etc., are 
a major part of our modern world. The performance and 
reliable functioning of complex networks depend on their 
structural robustness, e.g., their ability to retain functionality 
in the face of damage to parts of the network [40].

Robustness has been studied in many fields including 
physics, biology, mathematics, and networking. The re¬ 
search areas include quantifying robustness of a network ifTTl 
EsIEtIEII, studying the response of networks to various at¬ 
tack strategies |[Il[6l[T0l[T3lE2lESll, manipulating a network 
to improve its overall robustness Eciisiiia, and designing 
optimally robust networks from scratch ifTTlI^I^ISTl . 

A vast majority of prior work has focused on the global 
robustness of graphs at large. On the other hand, research
on local robustness is limited to a few works, e.g., on
finding robust subgraphs with large spectral radius |EI and 
identifying critical regions Oil. In this paper, we turn 
attention to local robustness and pose a novel subgraph 
mining problem: given a large graph, how can we find its 
most robust local subgraph of a given size? 

Our measure of robustness is the natural connectivity,
which is based on the reachability of the nodes, also phrased
as their “communicability” [13]. As we introduced in prior
work [7], it exhibits several desirable properties; e.g., it
captures redundancy by quantifying the count and length of
alternative/back-up paths between the nodes. As such,
robust subgraphs are intuitively sets of nodes with high
communicability among each other. From the practical point of
view, they may form the cores of larger communities or
constitute the central backbones in large networks, maintaining
connectivity and communication at large ifTHl . 

While the robust subgraph problem has not been studied 
before, similar problems have been addressed (Q. Probably 
the most similar to ours is the densest subgraph problem, 
aiming to find subgraphs with highest average degree ElEl 
ED or edge density (211 EH. However, density is different 
from robustness; while the former concerns the number
of edges in the subgraph, the latter also concerns the
topology (§2.2). We offer the following contributions.


• We formulate a new problem of finding the most robust
local subgraph (RLS) in a given graph. While in
the line of subgraph mining problems, it has not been
studied theoretically before (§3.1).

• We show that the RLS-Problem is NP-hard, and further
study its heredity and monotonicity properties (§3.2).

• We propose two fast heuristic algorithms to solve the 
RLS-Problem for large graphs: a top-down greedy 
algorithm that iteratively removes nodes, and a bottom- 
up approach based on the greedy randomized adaptive 
search procedure (GRASP) ifT^ (Q. 

• We introduce three practical variants of the RLS- 
Problem (^3.3); and show how to modify our algo¬ 
rithms to address these problem variants (Q. 

We extensively evaluate our methods on both synthetic
and real-world graphs. As our RLS-Problem is a new one,
we compare to three algorithms (one in [8], two in [39])
for the densest subgraph problem. We find subgraphs with
higher robustness than the densest subgraphs even at lower
densities, demonstrating that the existing algorithms are not
applicable to the new problem setting (§5).

2 Background and Preliminaries 

2.1 Graph Robustness Robustness is a critical property 
of large-scale networks, and thus has been studied in physics, 
mathematics, computer science, and biology. As a result, 
there exists a diverse set of robustness measures, e.g., mean 
shortest paths, efficiency, pairwise connectivity, etc. lEl. 

In this paper, we adopt a spectral measure of robustness
called natural connectivity [41], written as

(2.1)   $\bar{\lambda}(G) = \log\Big(\frac{1}{n}\sum_{i=1}^{n} e^{\lambda_i}\Big)$,

which can be thought of as the “average eigenvalue” of graph
G, where $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n$ denote a non-increasing
ordering of the eigenvalues of its adjacency matrix A.

Among other desirable properties [7], natural connectivity
is interpretable; it is directly related to the subgraph
centralities (SC) in the graph. The SC(i) of a node i is known
as its communicability [13], and is based on the “weighted”
sum of the number of closed walks that it participates in:

$S(G) = \sum_{i=1}^{n} SC(i) = \sum_{i=1}^{n}\sum_{k=0}^{\infty} \frac{(A^k)_{ii}}{k!}$,

where $(A^k)_{ii}$ is the number of closed walks of length k
of node i. The k! scaling ensures that the weighted sum
does not diverge, and that longer walks count less. S(G) is
also referred to as the Estrada index [13], which strongly
correlates with the folding degree of proteins.

Noting that $\sum_{i=1}^{n}(A^k)_{ii} = \mathrm{trace}(A^k) = \sum_{i=1}^{n}\lambda_i^k$ and
by the Taylor series of the exponential function, we can write

$S(G) = \sum_{k=0}^{\infty}\sum_{i=1}^{n}\frac{(A^k)_{ii}}{k!} = \sum_{i=1}^{n}\sum_{k=0}^{\infty}\frac{\lambda_i^k}{k!} = \sum_{i=1}^{n} e^{\lambda_i}$.

As such, natural connectivity is the normalized Estrada
index and quantifies the “average communicability” in G.

2.2 Robustness vs. Density Graph robustness appears to
be related to graph density; however, as we show here, there
exist key distinctions between them.

Firstly, while density directly uses the number of edges
e, such as $2e(G)/|V|$ as in average degree [4, 8, 21] or
$2e(G)/(|V|(|V|-1))$ as in edge density [28, 39], robustness
follows an indirect route; it quantifies the count and length
of paths and uses the graph spectrum. Thus, the objectives of
robust and dense subgraph mining problems are distinct.

Figure 1: Example graphs with the same density but different
robustness ($\bar{\lambda}$ = 0.9564, 0.9804, and 0.9965), due to their
distinct graph topology.

More notably, density concerns the number of
edges in the graph and not the topology. On the other
hand, for robustness the placement of edges (i.e., topology)
is as important, if not more so. In fact, graphs with the
same number of nodes and edges but different topologies are
indistinguishable from the density point of view (Figure 1).

To illustrate further, we show in Figure 2 the robustness
vs. density of example subgraphs, each of size 50, sampled¹
from a real-world email network (see Table 1). While the two
properties are correlated, subgraphs with the same density
can have a range of different robustness. In fact, among
the samples, the densest and the most robust subgraphs are
distinct, indicating that one does not always imply the other.

Figure 2: Robustness vs. Density of 100,000 connected subgraphs
(blue dots) from a real-world email network.
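
The distinction between density and robustness can be checked numerically. The following is a minimal sketch, assuming NumPy and dense adjacency matrices; the helper names (`natural_connectivity`, `edge_density`, `adjacency`) and the two toy graphs are illustrative assumptions and are not the graphs shown in Figure 1.

```python
import numpy as np

def natural_connectivity(A):
    """log of the mean of exp(lambda_i) over the adjacency spectrum (Equ. 2.1)."""
    eigvals = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    top = eigvals.max()
    # Shift by the largest eigenvalue so exp() cannot overflow on large graphs.
    return float(top + np.log(np.mean(np.exp(eigvals - top))))

def edge_density(A):
    """Fraction of node pairs that are connected by an edge."""
    n = A.shape[0]
    m = np.triu(A, k=1).sum()
    return float(m / (n * (n - 1) / 2))

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected simple graph."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    return A

# Two hypothetical 4-node graphs with 3 edges each: same density, different topology.
path = adjacency(4, [(0, 1), (1, 2), (2, 3)])
star = adjacency(4, [(0, 1), (0, 2), (0, 3)])

for name, A in [("path", path), ("star", star)]:
    print(name, edge_density(A), natural_connectivity(A))
```

Both graphs have edge density 0.5, yet the star has the higher natural connectivity, since its edges concentrate closed walks around one hub.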


3 Robust Local Subgraphs 

3.1 Problem Definition In their inspiring work [39],
Tsourakakis et al. recently defined a general framework for
subgraph density functions, which is written as

$f_\alpha(S) = g(e[S]) - \alpha h(|S|)$,

where $S \subseteq V$ is a set of nodes, $S \neq \emptyset$, e[S] is the number of
edges in the subgraph induced by S, $\alpha > 0$, and g and h are
any two strictly increasing functions.

Under this framework, maximizing the average degree
of a subgraph [4, 8, 21] corresponds to g(x) = h(x) = log x
and $\alpha = 1$ such that

$f(S) = \log\frac{e[S]}{|S|}$.

In order to define our problem, we can relate the objective
of our setting to this general framework. Specifically,
our objective can be written as

$f(S) = \log\frac{\sum_{i=1}^{|S|} e^{\lambda_i([S])}}{|S|}$,

which is to maximize the average eigenvalue of a subgraph.
Therefore, the objectives of the two problems are distinct,
although they both fall under a more general framework [39].


¹We create subgraphs by snowball sampling: pick a random node and progressively
add its neighbors with probability p, and iterate in a depth-first fashion.
Connectivity is guaranteed by adding at least one neighbor of each node. We use varying
$p \in (0, 1)$ to control the tree-likeness, and obtain subgraphs with various densities.












In the following we formally define our robust local
subgraph mining problem, which is to find the highest
robustness subgraph of a certain size (hence the locality) in
a given graph, which we call the RLS-Problem.

Problem 1. (RLS-Problem) Given a graph G = (V, E)
and an integer s, find a subgraph with nodes $S^* \subseteq V$ of size
$|S^*| = s$ such that

$f(S^*) = \log\sum_{i=1}^{s} e^{\lambda_i([S^*])} - \log s \;\geq\; f(S), \quad \forall S \subseteq V,\ |S| = s.$

S* is referred to as the most robust s-subgraph.

One can interpret a robust subgraph as containing a set
of nodes having large communicability within the subgraph.
Theorem 3.1. The RLS-Problem is NP-hard.
Proof. See Appendix A.
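
For intuition, Problem 1 can be solved exactly on very small graphs by enumerating every s-subset. The sketch below assumes NumPy; `most_robust_subgraph_bruteforce` and the toy graph are hypothetical illustrations and are not one of the algorithms proposed in this paper. Its cost grows as $\binom{n}{s}$, which is why heuristics are needed in practice.

```python
from itertools import combinations
import numpy as np

def natural_connectivity(A):
    """Robustness of Equ. (2.1) for a dense symmetric adjacency matrix."""
    eigvals = np.linalg.eigvalsh(A)
    top = eigvals.max()
    return float(top + np.log(np.mean(np.exp(eigvals - top))))

def most_robust_subgraph_bruteforce(A, s):
    """Exhaustively evaluate f(S) for every node subset of size s."""
    best_set, best_val = None, -np.inf
    for S in combinations(range(A.shape[0]), s):
        idx = list(S)
        val = natural_connectivity(A[np.ix_(idx, idx)])  # induced subgraph [S]
        if val > best_val:
            best_set, best_val = S, val
    return best_set, best_val

# Toy graph (an assumption for illustration): a triangle {0, 1, 2} with a tail 2-3-4-5.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

best_set, best_val = most_robust_subgraph_bruteforce(A, 3)
print(best_set, best_val)
```

On this toy graph the triangle is the only 3-subset with three induced edges, so it is the most robust 3-subgraph.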

3.2 Problem Properties Certain characteristics of hard
combinatorial problems sometimes guide the development of
approximation algorithms for those problems. In this work,
we study two such characteristics, namely semi-heredity and
subgraph monotonicity, for the RLS-Problem.

Problems that exhibit the (semi-)heredity or monotonicity
properties often enjoy algorithms that explore the search
space in a smart and efficient way. For example, cliques
exhibit heredity, i.e., all induced subgraphs are also cliques.
This is a key property used in successful algorithms for the
maximum clique problem, e.g., checking maximality by
inclusion is a trivial task and effective pruning strategies can be
employed within a branch-and-bound framework. Other
algorithms exploit monotonicity to employ “smart node ordering”
strategies to find iteratively improving solutions. Such
orderings help starting with a promising node and sequentially
adding the next node in the order such that the resulting
subgraphs all satisfy some desired criteria, like a minimum
density, which enables finding large solutions quickly.
Theorem 3.2. Robustness $\bar{\lambda}$ is not semi-hereditary. That
is, a graph with $\bar{\lambda} = \alpha$ and s > 1 nodes is not always a
strict superset of some graph with s - 1 nodes and $\bar{\lambda} \geq \alpha$.
Theorem 3.3. Robustness $\bar{\lambda}$ is not subgraph monotonic.
Proof. See the appendix for definitions and proofs.

Alas, robust subgraphs do not exhibit any of these properties.
This suggests that our RLS-Problem is likely harder
than the maximum clique and densest subgraph problems as,
unlike robust subgraphs, (quasi-)cliques are shown to exhibit,
e.g., the (semi-)heredity property [28].
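
As a small numerical illustration (not the appendix proof), the sketch below shows that removing a node can either raise or lower natural connectivity. The graphs (a 4-clique, a 4-clique with one pendant node, and a triangle) and the helper names are assumptions chosen for this example.

```python
import numpy as np

def natural_connectivity(A):
    """Robustness of Equ. (2.1) for a dense symmetric adjacency matrix."""
    eigvals = np.linalg.eigvalsh(A)
    top = eigvals.max()
    return float(top + np.log(np.mean(np.exp(eigvals - top))))

def clique(k):
    """Adjacency matrix of the complete graph on k nodes."""
    return np.ones((k, k)) - np.eye(k)

# K4 on nodes 0..3 plus a pendant node 4 attached to node 0.
k4_pendant = np.zeros((5, 5))
k4_pendant[:4, :4] = clique(4)
k4_pendant[0, 4] = k4_pendant[4, 0] = 1.0

r_k4_pendant = natural_connectivity(k4_pendant)
r_k4 = natural_connectivity(clique(4))   # after removing the pendant node
r_k3 = natural_connectivity(clique(3))   # after removing one more (clique) node

print(r_k4_pendant, r_k4, r_k3)
```

Dropping the pendant node increases robustness, while dropping a clique node afterwards decreases it, so robustness along a sequence of node removals is not monotone.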

3.3 Problem Variants In the appendix, we introduce three
practical variants of our RLS-Problem: finding (i) the
most robust subgraph (no size constraint), (ii) the top-k most
robust s-subgraphs, and (iii) the most robust s-subgraph
including a given set of seed nodes. We also show how
to adapt our algorithms for the RLS-Problem to these
variants (see the appendix).


4 Finding Robust Local Subgraphs 

Given the hardness of the RLS-Problem, we design two
heuristic solutions. The first is GreedyRLS, a top-down
approach that iteratively removes nodes to obtain a subgraph
of desired size. This greedy strategy serves as a simple
baseline. Our second and proposed solution, GRASP-RLS, is a
bottom-up randomized approach in which we iteratively add
nodes to build up our subgraphs. Both solutions order the
nodes by their contributions to the robustness.

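4.1 Greedy Top-down Search Approach This approach
iteratively and greedily removes the nodes one by one from
the given graph G = (V, E), |V| = n, |E| = m, until a
subgraph with the desired size s is reached. At each iteration,
the node whose removal results in the maximum robustness
of the residual graph is selected for removal.²

The removal of a node involves removing the node
itself and the edges attached to it from the graph, where the
residual graph becomes G[V\{i}]. Let i denote a node to be
removed. Let us then write the updated robustness $\bar{\lambda}_\Delta$ as

(4.2)   $\bar{\lambda}_\Delta = \log\Big(\frac{1}{n-1}\sum_{j=1}^{n-1} e^{\lambda_j + \Delta\lambda_j}\Big)$.

As such, we are interested in identifying the node that
maximizes $\bar{\lambda}_\Delta$, or equivalently

(4.3)   max.  $e^{\lambda_1+\Delta\lambda_1} + e^{\lambda_2+\Delta\lambda_2} + \ldots + e^{\lambda_{n-1}+\Delta\lambda_{n-1}}$
        $= e^{\lambda_1}\big(e^{\Delta\lambda_1} + e^{(\lambda_2-\lambda_1)}e^{\Delta\lambda_2} + \ldots + e^{(\lambda_{n-1}-\lambda_1)}e^{\Delta\lambda_{n-1}}\big)$
        $= e^{\lambda_1}\big(e^{\Delta\lambda_1} + c_2 e^{\Delta\lambda_2} + \ldots + c_{n-1}e^{\Delta\lambda_{n-1}}\big)$

where the $c_j$'s denote $e^{(\lambda_j-\lambda_1)}$ $\forall j \geq 2$ and $c_j \leq 1$.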

4.1.1 Updating the eigen-pairs When a node is removed
from the graph, its spectrum (i.e., the eigen-pairs $(\lambda_j, u_j)$)
also changes. Recomputing the eigenvalues to compute
robustness $\bar{\lambda}_\Delta$ every time a node is removed is computationally
challenging. Therefore, we employ fast update schemes
based on first-order matrix perturbation theory.

Let $\Delta A$ and $(\Delta\lambda_j, \Delta u_j)$ denote the change in A and
$(\lambda_j, u_j)$ $\forall j$, respectively, where $\Delta A$ is symmetric. Suppose
after the adjustment A becomes

$\tilde{A} = A + \Delta A$

where each eigen-pair $(\tilde{\lambda}_j, \tilde{u}_j)$ is written as

$\tilde{\lambda}_j = \lambda_j + \Delta\lambda_j$ and $\tilde{u}_j = u_j + \Delta u_j$

Lemma 4.1. Given a perturbation $\Delta A$ to a matrix A, its
eigenvalues can be updated by

(4.4)   $\Delta\lambda_j = u_j'\,\Delta A\,u_j$.

Proof. See Appendix D.1.

²Robustness of the residual graph can be lower or higher; S(G) decreases due to
monotonicity, but the denominator also shrinks to (s - 1) at every step.





Since updating the eigenvalues involves using the
eigenvectors, which also change with node removals, we use the
following to update the eigenvectors as well.

Lemma 4.2. Given a perturbation $\Delta A$ to a matrix A, its
eigenvectors can be updated by

(4.5)   $\Delta u_j = \sum_{i=1,\, i\neq j}^{n} \Big(\frac{u_i'\,\Delta A\,u_j}{\lambda_j - \lambda_i}\,u_i\Big)$.

Proof. See Appendix D.2.
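
The two update rules can be sketched and sanity-checked against an exact eigendecomposition. This is a minimal illustration assuming NumPy, a dense symmetric matrix, and distinct eigenvalues (Equ. (4.5) divides by $\lambda_j - \lambda_i$); the function name `first_order_update`, the 5-node path graph, and the small perturbation size are assumptions made for the example. For an actual node removal the entries of $\Delta A$ are -1, so the approximation is coarser than in this check.

```python
import numpy as np

def first_order_update(A, dA):
    """Approximate eigen-pairs of A + dA via Lemma 4.1 and Lemma 4.2."""
    lam, U = np.linalg.eigh(A)          # columns of U are the eigenvectors u_j
    M = U.T @ dA @ U                    # M[i, j] = u_i' dA u_j
    d_lam = np.diag(M).copy()           # Equ. (4.4)
    dU = np.zeros_like(U)
    n = len(lam)
    for j in range(n):
        for i in range(n):
            if i != j:
                # Equ. (4.5); assumes lam[j] != lam[i]
                dU[:, j] += M[i, j] / (lam[j] - lam[i]) * U[:, i]
    return lam + d_lam, U + dU

# Path graph on 5 nodes: its eigenvalues 2cos(k*pi/6) are all distinct.
A = np.zeros((5, 5))
for u in range(4):
    A[u, u + 1] = A[u + 1, u] = 1.0

# Slightly weaken edge (0, 1); a small, symmetric perturbation.
dA = np.zeros((5, 5))
dA[0, 1] = dA[1, 0] = -1e-3

lam_est, U_est = first_order_update(A, dA)
lam_exact, U_exact = np.linalg.eigh(A + dA)
print(np.abs(lam_est - lam_exact).max())
```

The printed value is the largest gap between the first-order estimate and the exact eigenvalues; it is of second order in the size of the perturbation.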

4.1.2 Node selection for removal By using Lemma 4.1,
we can write the effect of perturbing A with the removal of
a node i on the eigenvalues as

(4.6)   $\Delta\lambda_j = u_j'\,\Delta A\,u_j = -2u_{ij}\sum_{v \in N(i)} u_{vj}$

where $\Delta A(i, v) = \Delta A(v, i) = -1$ for $v \in N(i)$, and 0
elsewhere, and N(i) denotes the set of neighbors of i. Thus,
at each step we choose the node $i \in V$ that maximizes

(4.7)   $e^{-2u_{i1}\sum_{v \in N(i)} u_{v1}} + \ldots + c_{n-1}\,e^{-2u_{i,n-1}\sum_{v \in N(i)} u_{v,n-1}}$

We remark that it is infeasible to compute all the n
eigenvalues of a graph with n nodes, for very large n. Thanks
to the skewed spectrum of real-world graphs, we can
rely on the observation that only the top few eigenvalues have
large magnitudes. This implies that the $c_j$ terms in Equ. (4.3)
and also Equ. (4.7) become much smaller for increasing j
and can be ignored. Therefore, we use the top t eigenvalues
to approximate the robustness of a graph. In the past, the
skewed property of the spectrum has also been exploited to
approximate triangle counts in large graphs [38].

The outline of the GreedyRLS algorithm, its complexity
analysis, and its adaptations for the RLS-Problem variants
(§3.3) can be found in the appendix.
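
A single selection step of the top-down search, scored with Equ. (4.6) and (4.7) over the top t eigen-pairs, can be sketched as follows. This assumes NumPy and a dense eigendecomposition for simplicity (a sparse top-t solver would be the natural choice on large graphs); `removal_scores`, `pick_node_to_remove`, and the toy graph are illustrative assumptions, not the GreedyRLS outline from the appendix.

```python
import numpy as np

def removal_scores(A, t):
    """Score of Equ. (4.7) for every node, using only the top-t eigen-pairs."""
    lam, U = np.linalg.eigh(A)
    top = np.argsort(lam)[::-1][:t]          # indices of the t largest eigenvalues
    lam_t, U_t = lam[top], U[:, top]
    c = np.exp(lam_t - lam_t[0])             # c_1 = 1 and c_j <= 1
    # Entry (i, j) of A @ U_t sums u_vj over the neighbors v of i, so this is Equ. (4.6).
    d_lam = -2.0 * U_t * (A @ U_t)
    return (c * np.exp(d_lam)).sum(axis=1)

def pick_node_to_remove(A, t):
    """Node whose removal is estimated to leave the most robust residual graph."""
    return int(np.argmax(removal_scores(A, t)))

# Toy graph (assumption): 4-clique on nodes 0..3 with a pendant node 4 attached to node 0.
A = np.zeros((5, 5))
A[:4, :4] = np.ones((4, 4)) - np.eye(4)
A[0, 4] = A[4, 0] = 1.0

print(pick_node_to_remove(A, t=2))
```

On this toy graph the pendant node is selected, since removing it perturbs the leading eigenvalue the least.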


4.2 Greedy Randomized Adaptive Search Procedure
(GRASP) Approach The top-down approach makes a
greedy decision at every step. If the desired subgraphs are
small, however, this incurs many greedy decisions, especially
on large graphs where the number of greedy steps
(n - s) would be excessive. Since the RLS-Problem does
not exhibit monotonicity or semi-heredity properties (§3.2),
taking a large number of greedy steps can yield poor
performance. Therefore, we propose a bottom-up approach that
performs local operations to build up solutions from scratch.

Our local approach is based on a meta-heuristic called 
GRASP CSl for solving combinatorial optimization prob¬ 
lems. A GRASP, or greedy randomized adaptive search 
procedure, is a multi-start or iterative process, in which each 
iteration consists of two phases: (i) a construction phase, in 
which an initial feasible solution is produced, and (ii) a local 
search phase, in which a better solution with higher objec¬ 
tive value in the neighborhood of the constructed solution is 
sought. The best overall solution becomes the final result. 


The pseudo-code in Algorithm 1 shows the general
GRASP for maximization, where $T_{max}$ iterations are done.
For maximizing our objective, we use $f : S \rightarrow \mathbb{R} \equiv \bar{\lambda}$,
i.e., the robustness function as given in Equ. (2.1). We next
describe the details of our two GRASP phases.

Algorithm 1 GRASP-RLS
Input: Graph G = (V, E), $T_{max}$, f(·), g(·), integer s
Output: Subset of nodes $S^* \subseteq V$, $|S^*| = s$
1: $f^* = -\infty$, $S^* = \emptyset$
2: for z = 1, 2, ..., $T_{max}$ do
3:    $S \leftarrow$ GRASP-RLS-Construction(G, g(·), s)
4:    $S' \leftarrow$ GRASP-RLS-LocalSearch(G, S, f(·), s)
5:    if $f(S') > f^*$ then $S^* \leftarrow S'$, $f^* = f(S')$
6: end for
7: return $S^*$

4.2.1 Construction In the construction phase, a feasible
seed solution is iteratively constructed, one node at a time.
At each iteration, the choice of the next node to be added is
determined by ordering all candidate nodes C in a restricted
candidate list, called RCL, with respect to a greedy function
$g : C \rightarrow \mathbb{R}$, and randomly choosing one of the candidates
in the list. The candidate set in the first iteration is set to V, and
in later iterations it contains the nodes in the neighborhood
N(S) of the current solution S. The size of RCL is
determined by a real parameter $\beta \in [0, 1]$, which controls the
amount of greediness and randomness. $\beta = 1$ corresponds to
a purely greedy construction, while $\beta = 0$ produces a purely
random one. Algorithm 2 describes our construction phase.

Algorithm 2 GRASP-RLS-Construction
Input: Graph G = (V, E), integer s
Output: Subset of nodes $S \subseteq V$
1: $S \leftarrow \emptyset$, $C \leftarrow V$
2: while |S| < s do
3:    Evaluate g(v) for all $v \in C$
4:    $\bar{c} \leftarrow \max_{v \in C} g(v)$, $\underline{c} \leftarrow \min_{v \in C} g(v)$
5:    Select $\beta \in [0, 1]$ using a strategy
6:    RCL $\leftarrow \{v \in C \mid g(v) \geq \underline{c} + \beta(\bar{c} - \underline{c})\}$
7:    Select a vertex r from RCL at random
8:    $S := S \cup \{r\}$, $C \leftarrow N(S) \setminus S$
9: end while
10: return S
Selecting g(·): We aim to include locally dense nodes in
our seed solutions. Therefore, in the first iteration of the
construction we use $g(v) = t(v)/d(v)$, where t(v) denotes the
number of local triangles of v, and d(v) is its degree. Initially
the candidate set C is equal to the node set V, thus we
approximate the local triangle counts for speed [38]. In later
iterations we use $g(v) = \Delta\bar{\lambda}_v$, the difference in robustness
when a candidate node is added to the current subgraph.
Selecting $\beta$: Setting $\beta = 1$ is purely greedy and produces
the same seed subgraph in every GRASP iteration. To
incorporate randomness while staying close to the greedy
best-first selection, we choose $\beta \in [0.8, 1]$ uniformly at
random at every step. This produces high quality solutions in
the presence of large variance in constructed solutions.
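
The construction phase can be sketched as below, assuming NumPy and a dense adjacency matrix. Two simplifications are assumptions of this sketch rather than the paper's implementation: local triangle counts are computed exactly from $A^3$ instead of being approximated, and the robustness gain is recomputed from scratch for each candidate. The names `grasp_construction`, `beta_range`, and `tol` are hypothetical; `tol` only guards the RCL threshold against floating-point ties.

```python
import random
import numpy as np

def natural_connectivity(A, nodes):
    """Robustness (Equ. 2.1) of the subgraph induced by `nodes`."""
    idx = list(nodes)
    eigvals = np.linalg.eigvalsh(A[np.ix_(idx, idx)])
    top = eigvals.max()
    return float(top + np.log(np.mean(np.exp(eigvals - top))))

def grasp_construction(A, s, rng, beta_range=(0.8, 1.0), tol=1e-9):
    """Build a seed solution one node at a time from a restricted candidate list."""
    degree = A.sum(axis=1)
    triangles = np.diag(A @ A @ A) / 2.0        # exact local triangle counts t(v)
    S, candidates = [], list(range(A.shape[0]))
    while len(S) < s and candidates:
        if not S:
            # First iteration: g(v) = t(v) / d(v)
            g = {v: triangles[v] / max(degree[v], 1.0) for v in candidates}
        else:
            # Later iterations: g(v) = gain in robustness when v joins S
            base = natural_connectivity(A, S)
            g = {v: natural_connectivity(A, S + [v]) - base for v in candidates}
        c_max, c_min = max(g.values()), min(g.values())
        beta = rng.uniform(*beta_range)
        threshold = c_min + beta * (c_max - c_min)
        rcl = [v for v in candidates if g[v] >= threshold - tol]
        S.append(rng.choice(rcl))
        in_S = set(S)
        # New candidate set: neighbors of the current solution, excluding S itself.
        candidates = sorted({int(u) for v in S for u in np.flatnonzero(A[v])} - in_S)
    return S

# Toy graph (assumption): 4-clique on nodes 0..3 with a tail 3-4-5-6.
A = np.zeros((7, 7))
A[:4, :4] = np.ones((4, 4)) - np.eye(4)
for u, v in [(3, 4), (4, 5), (5, 6)]:
    A[u, v] = A[v, u] = 1.0

seed_solution = grasp_construction(A, 4, random.Random(0))
print(sorted(seed_solution))
```

With `beta_range=(1.0, 1.0)` the construction is purely greedy and recovers the 4-clique on this toy graph; smaller values of β admit more candidates into the RCL and diversify the seeds across GRASP iterations.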

4.2.2 Local Search A solution generated by GRASP-
RLS-Construction is a preliminary one and may not
necessarily have the best robustness. Thus, it is almost
always beneficial to apply a local refinement procedure to
each constructed solution. A local search algorithm works
in an iterative fashion by successively replacing the current
solution with a better one in the neighborhood of the current
solution. It terminates when no better solution can be found.
We describe our local search phase in Algorithm 3.

As the RLS-Problem asks for a subgraph of size
s, the local search takes as input an s-subgraph generated
by construction and searches for a better solution around
it by “swapping” nodes in and out. Ultimately it finds a
locally optimal subgraph of size upper bounded by s + 1.
As an answer, it returns the best s-subgraph with the highest
robustness found over the iterations. As such, GRASP-
RLS-LocalSearch is an adaptation of a general local
search procedure to yield subgraphs of desired size.

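Algorithm 3 GRASP-RLS-LocalSearch
Input: Graph G = (V, E), S, integer s
Output: Subset of nodes $S' \subseteq V$, $|S'| = s$
1: more $\leftarrow$ TRUE, $S' \leftarrow S$
2: while more do
3:    if $\exists v \in S$ such that $\bar{\lambda}(S \setminus \{v\}) > \bar{\lambda}(S)$ then
4:       $S := S \setminus \{v^*\}$ s.t. $v^* := \arg\max_{v \in S} \bar{\lambda}(S \setminus \{v\})$
5:       if |S| = s then $S' \leftarrow S$ end if
6:    else
7:       more $\leftarrow$ FALSE
8:    end if
9:    add $\leftarrow$ TRUE
10:   while add and $|S| \leq s$ do
11:      if $\exists v \in N(S) \setminus S$ s.t. $\bar{\lambda}(S \cup \{v\}) > \bar{\lambda}(S)$ then
12:         $S := S \cup \{v^*\}$, $v^* := \arg\max_{v \in N(S) \setminus S} \bar{\lambda}(S \cup \{v\})$
13:         more $\leftarrow$ TRUE
14:         if |S| = s then $S' \leftarrow S$ end if
15:      else
16:         add $\leftarrow$ FALSE
17:      end if
18:   end while
19: end while
20: return S'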

The local search is guaranteed to terminate, as the
objective value (i.e., subgraph robustness) improves with
every iteration and it is upper-bounded by the robustness of
the (s + 1)-clique. We provide the complexity analysis and
the GRASP-RLS algorithm variants in the appendix.
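
The remove-then-add loop of the local search can be sketched as follows, assuming NumPy and exact recomputation of robustness for every candidate move. The function name `local_search`, the strict-improvement tolerance `tol`, and the toy graph are assumptions of this sketch; a practical implementation would rely on incremental eigen-pair updates rather than full decompositions.

```python
import numpy as np

def natural_connectivity(A, nodes):
    """Robustness (Equ. 2.1) of the subgraph induced by `nodes`."""
    idx = list(nodes)
    eigvals = np.linalg.eigvalsh(A[np.ix_(idx, idx)])
    top = eigvals.max()
    return float(top + np.log(np.mean(np.exp(eigvals - top))))

def local_search(A, S, s, tol=1e-12):
    """Swap nodes out of and into S while robustness strictly improves."""
    S, best = list(S), list(S)
    more = True
    while more:
        # Removal step: drop the node whose removal helps most, if it helps at all.
        current = natural_connectivity(A, S)
        drop = None
        if len(S) > 1:
            drop = max(S, key=lambda v: natural_connectivity(A, [u for u in S if u != v]))
        if drop is not None and natural_connectivity(A, [u for u in S if u != drop]) > current + tol:
            S.remove(drop)
            if len(S) == s:
                best = list(S)
        else:
            more = False
        # Addition step: grow S from its neighborhood while it stays within s + 1 nodes.
        add = True
        while add and len(S) <= s:
            current = natural_connectivity(A, S)
            frontier = sorted({int(u) for v in S for u in np.flatnonzero(A[v])} - set(S))
            pick = max(frontier, key=lambda v: natural_connectivity(A, S + [v])) if frontier else None
            if pick is not None and natural_connectivity(A, S + [pick]) > current + tol:
                S.append(pick)
                more = True
                if len(S) == s:
                    best = list(S)
            else:
                add = False
    return best

# Toy graph (assumption): 4-clique on nodes 0..3 with a tail 3-4-5.
A = np.zeros((6, 6))
A[:4, :4] = np.ones((4, 4)) - np.eye(4)
for u, v in [(3, 4), (4, 5)]:
    A[u, v] = A[v, u] = 1.0

start = [2, 3, 4]                      # a path, i.e. a weak seed solution
refined = local_search(A, start, s=3)
print(sorted(refined), natural_connectivity(A, refined))
```

Starting from the path on nodes 2, 3, 4, the search first grows to four nodes, then drops the tail node, and returns a triangle inside the clique as the best subgraph of size 3.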


Table 1: Real-world graphs. $\delta$: edge density. $\bar{\lambda}$: robustness.

Dataset          n = |V|    m = |E|    $\delta$              $\bar{\lambda}$
Jazz                 198       2742    0.1406                34.74
Celegans N.          297       2148    0.0489                21.32
Email               1133       5451    0.0085                13.74
Oregon-A            7352      15665    0.0005                42.29
Oregon-B           10860      23409    0.0004                47.54
Oregon-C           13947      30584    0.0003                52.10
P2P-GnutellaA       6301      20777    0.0010                19.62
P2P-GnutellaB       8114      26013    0.0008                19.45
P2P-GnutellaC       8717      31525    0.0008                13.35
P2P-GnutellaD       8846      31839    0.0008                14.46
P2P-GnutellaE      10876      39994    0.0007                 7.83
DBLP              317080    1049866    $2.09 \times 10^{-5}$   103.18
Web-Google        875713    4322051    $1.13 \times 10^{-5}$    99.36


5 Evaluation 

We evaluate our methods extensively on numerous synthetic
and real-world graphs. Our real graphs, as in Table 1, come
from various domains, including biological, email, Internet
AS backbone, P2P, collaboration, and the Web.

Our work is in the general lines of subgraph mining,
however with a new objective based on robustness. The
closest to our setting is densest subgraph mining. Therefore,
we compare our results to dense subgraphs found by
Charikar’s algorithm [8] (which we refer to as Charikar), as
well as by Tsourakakis et al.’s two algorithms [39] (which
we refer to as Greedy and LS for local search following the
convention in their work). We remark that the objectives
used in those works are distinct, namely average degree and
edge-surplus, respectively, and are also different from ours.

We first evaluate the accuracy of the algorithms against 
ground truth. To do so, we create synthetic graphs and inject 
a clique in each graph. Note that a clique is both the densest 
and the most robust subgraph of a certain size. Therefore, 
the algorithms are compared on the same grounds. 

Table 2 provides precision, recall, and subgraph size
|S| averaged over ten Erdős-Rényi random graphs, with
n = 3000 nodes and p = {0.5, 0.1, 0.008}, in which we
inject a clique of size 30. The p values are selected to capture
very dense, medium-dense, and sparse graphs. We notice that
while all methods perform sufficiently well for sparse graphs
with p = 0.008, the accuracy of GRASP-RLS is superior to the
competing methods at all densities.
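
The ground-truth setup can be sketched as follows, assuming NumPy. The helpers `er_with_planted_clique` and `precision_recall` are hypothetical names for this illustration, and the small graph size in the example is chosen only to keep the sketch quick to run.

```python
import numpy as np

def er_with_planted_clique(n, p, k, rng):
    """Erdos-Renyi G(n, p) adjacency matrix with a clique injected on k random nodes."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    A = (upper | upper.T).astype(float)
    clique = rng.choice(n, size=k, replace=False)
    A[np.ix_(clique, clique)] = 1.0     # connect every pair of clique nodes
    A[clique, clique] = 0.0             # but keep the diagonal free of self-loops
    return A, {int(v) for v in clique}

def precision_recall(found, truth):
    """Precision and recall of a returned node set against the injected clique."""
    found, truth = set(found), set(truth)
    hits = len(found & truth)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

A, clique = er_with_planted_clique(50, 0.1, 5, np.random.default_rng(0))

# A method that returns all 3000 nodes for a 30-clique has perfect recall
# but precision 30/3000, which is the pattern seen for very dense host graphs.
print(precision_recall(range(3000), range(30)))
```

When a method returns exactly as many nodes as the injected clique contains, precision and recall coincide.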

We also compare the algorithms on the Chung-Lu random
power-law graphs, with n = 3000 and power-law
exponent varying from 2.2 to 3.1 as observed in real graphs
(larger exponent implies a sparser graph), in which we inject
a clique of 15 nodes. We run the LS algorithm seeded with
one of the nodes of the clique as previously done in [39],
while GRASP-RLS is not favored by such a selection.
Precision and recall curves averaged over ten graphs are given
in Figure 3. We note that while the accuracies of all methods
improve with the increasing exponent as the task becomes
easier, GRASP-RLS remains superior with robust performance
at all host graph densities.

Table 2: Precision & Recall (avg.) for our GRASP-RLS & GreedyRLS, Charikar [8], Greedy & LS [39] on ten ER graphs.

ER parameters    GRASP-RLS     GreedyRLS     Charikar [8]         Greedy [39]          Local Search [39]
n      p         |S|   P=R     |S|   P=R     |S|    P     R       |S|    P     R       |S|    P     R
3000   0.5       30    0.97    30    0.02    3000   0.01  1       3000   0.01  1       3000   0.01  1
3000   0.1       30    1       30    0.95    3000   0.01  1       29.60  0.99  0.97    20.63  0.37  0.35
3000   0.008     30    1       30    0.99    30     1     1       30     1     1       28.23  0.94  0.93

Figure 3: Precision & Recall for our GRASP-RLS, Charikar [8],
Greedy & Local Search [39] vs. exponent of power-law graphs.

Cliques are both the densest and the most robust
subgraphs; however, it is expected that the algorithms will find
different subgraphs in general due to their distinct objectives.
To understand their differences, we turn to real-world graphs
and compare the robust and dense subgraphs based on three
main criteria: (a) robustness $\bar{\lambda}$ as in Equ. (2.1), (b) triangle
density $t[S]/\binom{|S|}{3}$, and (c) edge density $e[S]/\binom{|S|}{2}$.
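
The two density criteria can be computed directly from the induced adjacency matrix. This is a minimal sketch assuming NumPy and at least three nodes in S; `edge_density` and `triangle_density` are illustrative helper names.

```python
from math import comb
import numpy as np

def edge_density(A, S):
    """e[S] divided by the number of node pairs in S."""
    idx = list(S)
    sub = A[np.ix_(idx, idx)]
    edges = np.triu(sub, k=1).sum()
    return float(edges / comb(len(idx), 2))

def triangle_density(A, S):
    """t[S] divided by the number of node triples in S (requires |S| >= 3)."""
    idx = list(S)
    sub = A[np.ix_(idx, idx)]
    # Each triangle contributes 6 closed walks of length 3 to trace(sub^3).
    triangles = np.trace(sub @ sub @ sub) / 6.0
    return float(triangles / comb(len(idx), 3))

# Toy graph (assumption): 4-clique on nodes 0..3 with a tail 3-4-5.
A = np.zeros((6, 6))
A[:4, :4] = np.ones((4, 4)) - np.eye(4)
for u, v in [(3, 4), (4, 5)]:
    A[u, v] = A[v, u] = 1.0

print(edge_density(A, [0, 1, 2, 3]), triangle_density(A, [0, 1, 2, 3]))
print(edge_density(A, [3, 4, 5]), triangle_density(A, [3, 4, 5]))
```

A clique attains 1 on both measures, while a path has no triangles and an edge density below 1.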

Table 3 shows results on our largest graphs from each
category. Note that the three algorithms we compare to try
to find the densest subgraph without a size restriction. Thus,
each one obtains a subgraph of a different size. To make the
robust subgraphs (RS) comparable to the densest subgraphs
(DS), we find subgraphs of size s equal to the ones found by
Charikar, Greedy, and LS, respectively noted as $s_{Ch}$, $s_{Gr}$,
and $s_{Ls}$. As such, we compare to the best results achieved
by each of the densest subgraph algorithms.

We notice that densest subgraphs found by Greedy 
and LS are often substantially smaller than those found by 
Charikar, and also have higher edge density, which is the 
same observation as in [39]. On the other hand, robust
subgraphs have higher robustness than densest subgraphs, 
even at lower densities. This shows that high density does 
not always imply high robustness, and vice versa, illustrating 
the differences in the two problem settings. 

Thus far, we also note that GRASP-RLS consistently 
outperforms GreedyRLS, suggesting that the proposed 
bottom-up search is superior to the greedy top-down search. 

Figure 4 shows the relative difference in robustness of
GRASP-RLS subgraphs over, again, the best results obtained
by Charikar, Greedy, and LS on all of our real graphs.
We achieve a wide range of improvements depending on the
graph, where the difference is always positive. The
improvements with respect to the LS results are the most pronounced.



Figure 4: Robustness improvement (%) of GRASP-RLS over 
(top to bottom) best LS, Greedy, and Charikar results. 
Figure 5: Subgraph robustness at varying subgraph sizes s.

Comparisons in Table 3 and Figure 4 are for subgraphs
at sizes where best results are obtained for each of the three
densest subgraph algorithms. Our algorithms, on the other
hand, accept a subgraph size input s. Thus, we next compare
the competing methods at varying output sizes. Charikar
and Greedy are both top-down methods, in which the lowest
degree node is removed at each step and the best subgraph
(best average degree or edge surplus, respectively) is output
among all graphs created along the way. We modify these so
that we pull out the subgraphs when size s is reached during
the course of their run.³ Figure 5 shows that our GRASP-
RLS produces subgraphs with higher robustness at varying
sizes on two example graphs (similar results on others). This
also shows that the densest subgraph approaches are not
directly applicable to our problem.

Experiments thus far illustrate that we find subgraphs 
with robustness higher than the densest subgraphs. These 
are relative results. To show that the subgraphs we find are in 
fact robust, we next quantify the magnitude of the robustness 
values we achieve through significance tests. 


³Local search by [39] finds locally optimal subgraphs, which are not guaranteed to
grow to a given size s. Thus, we omit comparison to LS subgraphs at varying sizes.
Figure 4 shows that improvements over LS subgraphs are already substantially large.



































Table 3: Comparison of robust and densest subgraphs. Ch: Charikar [8], Gr: Greedy [39], Ls: Local search [39].

Data     Method                  robustness $\bar{\lambda}[S]$        triangle density $\Delta[S]$          edge density $\delta[S]$
         ($s_{Ch}$, $s_{Gr}$, $s_{Ls}$)   Ch      Gr      Ls         Ch         Gr       Ls           Ch       Gr       Ls

         DS (271, 12, 13)        13.58   8.51    4.96       0.0009     1.0000   0.2237       0.0600   1.0000   0.5897
         RS-Greedy               13.94   5.96    6.27       0.0001     0.0696   0.0606       0.0523   0.7576   0.7179
         RS-GRASP                14.04   8.52    8.91       0.0007     1.0000   0.8671       0.0508   1.0000   0.9487

         DS (87, 61, 52)         34.44   30.01   27.69      0.0868     0.1768   0.2327       0.3892   0.5311   0.5927
         RS-Greedy               34.31   24.70   21.75      0.0857     0.1022   0.1193       0.3855   0.4131   0.4367
         RS-GRASP                34.47   30.14   28.01      0.0870     0.1775   0.2375       0.3884   0.5301   0.5943

P2P-E    DS (386, 22, 4)         8.81    6.40    0.86       9.77E-06   0.0      0.0          0.0306   0.4372   0.6667
         RS-Greedy               9.10    5.22    0.86       6.83E-06   0.0      0.0          0.0267   0.3593   0.6667
         RS-GRASP                9.22    6.41    1.29       6.93E-06   0.0      0.5          0.0270   0.4372   0.8333

         DS (240, 105, 18)       52.15   47.62   10.20      0.0266     0.2160   0.4178       0.2274   0.4759   0.7254
         RS-Greedy               41.57   22.56   8.69       0.0027     0.0082   0.2525       0.0710   0.1225   0.6144
         RS-GRASP                53.96   48.68   14.11      0.0153     0.1246   1.0000       0.1296   0.3996   1.0000


Given a subgraph that GRASP-RLS finds, we bootstrap 
B = 1000 new subgraphs by rewiring its edges at random. 
We compute an empirical p-value for each subgraph by di¬ 
viding the number of randomly rewired subgraphs that have 
larger robustness by B. The p-value essentially captures the 
probability that we would be able to obtain a subgraph with 
robustness greater than what we find by chance if we were to 
create a topology with the same number of nodes and edges 
at random (note that all such subgraphs would have the same 
edge density). Thus a low p-value implies that, among the 
same density topologies, the one we find is in fact robust 
with high probability. 
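
The significance test can be sketched as follows, assuming NumPy. One labeled assumption: "rewiring at random" is implemented here as drawing the same number of edges uniformly among all node pairs of the subgraph, which preserves the node and edge counts (and hence the edge density); the function name `empirical_p_value` and the tolerance `tol` are hypothetical.

```python
import numpy as np

def natural_connectivity(A):
    """Robustness of Equ. (2.1) for a dense symmetric adjacency matrix."""
    eigvals = np.linalg.eigvalsh(A)
    top = eigvals.max()
    return float(top + np.log(np.mean(np.exp(eigvals - top))))

def empirical_p_value(A_sub, B=1000, seed=0, tol=1e-12):
    """Fraction of B random same-size, same-density graphs that are more robust."""
    rng = np.random.default_rng(seed)
    n = A_sub.shape[0]
    rows, cols = np.triu_indices(n, k=1)        # all node pairs
    m = int(A_sub[rows, cols].sum())            # number of edges to preserve
    observed = natural_connectivity(A_sub)
    larger = 0
    for _ in range(B):
        chosen = rng.choice(len(rows), size=m, replace=False)
        R = np.zeros((n, n))
        R[rows[chosen], cols[chosen]] = 1.0
        R = R + R.T                             # symmetric random topology
        if natural_connectivity(R) > observed + tol:
            larger += 1
    return larger / B

# A 5-clique cannot be improved by rewiring, so its p-value is 0.
K5 = np.ones((5, 5)) - np.eye(5)
print(empirical_p_value(K5, B=200))

# A 6-node path is a weak topology for its density, so many rewirings beat it.
P6 = np.zeros((6, 6))
for u in range(5):
    P6[u, u + 1] = P6[u + 1, u] = 1.0
print(empirical_p_value(P6, B=200))
```

A low value indicates that few random topologies of the same density are more robust than the subgraph under test.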

Figure 6 shows that the subgraphs we find on almost all
real graphs are significantly robust at the 0.05 level. For cases
with large p-values, it is possible to obtain higher robustness
subgraphs with rewiring. For example, P2P-E is a graph where
all the robust subgraphs (also the dense subgraphs) found
contain very few or no triangles (see Table 3). Therefore,
rewiring edges so that they short-cut the longer cycles these
subgraphs contain helps improve their robustness. We remark
that large p-values indicate that the found subgraphs are not
significantly robust, but do not imply that our algorithms are
unable to find robust subgraphs. That is because the rewired,
more robust subgraphs do not necessarily exist in the original
graph G, and it is likely that G does not contain any subgraph
with robustness that is statistically significant.
Figure 6: p-values of significance tests indicate that GRASP-
RLS subgraphs have significantly large robustness. 


Next we analyze the performance of our GRASP-RLS
approach in more detail. Recall that GRASP-RLS-Construction
quickly builds a subgraph, which GRASP-RLS-LocalSearch
then improves to obtain a better result. In Figure 7 we show
the robustness of subgraphs obtained at construction and after
local search on two example graphs for s = 50 and
$T_{max}$ = 300. We notice that most of the GRASP-RLS
iterations find a high robustness subgraph right at construction.
In most other cases, local search is able to improve over
construction results significantly. In fact, the most robust
outcome on Oregon-A (Figure 7, left) is obtained when
construction builds a subgraph with robustness around
$\bar{\lambda}$ = 6, which the local search improves to over $\bar{\lambda}$ = 20.

Figure 7: $\bar{\lambda}$ achieved at GRASP-RLS-Construction versus
after GRASP-RLS-LocalSearch.


We next perform several case studies on the DBLP
co-authorship network to qualitatively analyze our subgraphs.
Here, we use the seeded variant of our problem (Appendix C).
Christos Faloutsos is a prolific researcher with various
interests. In Figure 8(a), we invoke his interest in databases
by using him together with Rakesh Agrawal as seeds, as Agrawal
is an expert in this field. Later in (b), we invoke his
interest in data mining when we use Jure Leskovec as the
second seed, who is a rising star in the field. Likewise in
(c) and (d) we find robust subgraphs around other selected
prominent researchers in data mining and databases. In (d) we
show how our subgraphs change with varying size. Specifically,
we find a clique that the seeds J. Widom and J. Ullman belong
to, for $s = 10$. The subgraph for $s = 15$, while no longer a
clique, remains stable and additionally includes other
researchers such as R. Motwani and H. Garcia-Molina.

(b) {C. Faloutsos, J. Leskovec}: ($\delta$=0.78, $\Delta$=0.51, $\bar{\lambda}$=5.0)
(c) {J. Han, C. C. Aggarwal}: ($\delta$=0.91, $\Delta$=0.78, $\bar{\lambda}$=6.0)
(d) {J. Widom, J. D. Ullman}: ($\delta$=1, $\Delta$=1, $\bar{\lambda}$=6.7) and ($\delta$=0.8, $\Delta$=0.58, $\bar{\lambda}$=9.0)

Figure 8: Robust DBLP subgraphs returned by our GRASP-RLS algorithm when seeded with authors indicated in (a)-(d).


Given the local search characteristics of GRASP-RLS,
its complexity is linear in the host graph size, as
theoretically shown in Appendix F.1. Figure 9 also
illustrates the linear scalability w.r.t. input graph size
empirically.

Figure 9: Scalability of GRASP-RLS by graph size $m$ and
subgraph size $s$ (run time averaged over 10 runs; bars: 25%-75%).


6 Related Work 

Robustness is a critical property of networked systems.
Thus, it has been studied extensively in various fields
including physics, biology, mathematics, and networking. One
of the early studies in measuring graph robustness shows that
scale-free graphs are robust to random failures but
vulnerable to intentional, carefully planned attacks [1]. This
observation has stimulated studies on the response of
networks to various attack strategies [6][7][10][13][22][26].
Other works look at how to design networks that are optimal
with respect to some survivability criteria [?]. A vast body
of these works focuses on global robustness of graphs at
large.

With respect to research on local robustness, Trajanovski
et al. aim to spot critical regions in a graph, the
destruction of which would cause the biggest harm to the
network [?]. Similar works aim to identify the critical nodes
and links of a network [?][25][32][34]. These works try to
spot vulnerability points in the network, whereas our
objective is somewhat orthogonal: to identify robust regions.
Closest to ours, Andersen et al. consider the spectral radius
as an objective criterion and propose algorithms for
identifying small subgraphs with large spectral radius [?].


Footnote (scalability experiment): We use nine Oregon graphs of various sizes [?], the largest three of which are
listed in the dataset table. Running time is averaged over $T_{max}$ iterations, as each
pair of construction and local search phases can be run completely in parallel.


While having major distinctions, as we illustrated in
this work, robust subgraphs are related to dense subgraphs,
which have been studied extensively. Finding the largest
clique in a graph, well known to be NP-complete [19], is
also shown to be hard to approximate [22].

A relaxation of the clique problem is the densest subgraph
problem. Goldberg [?] and Charikar [?] designed exact
polynomial-time and 1/2-approximate linear-time solutions to
this problem, respectively, where density is defined as the
average degree. This problem is shown to become NP-hard
when the size of the subgraph is restricted [?]. Most
recently, Tsourakakis et al. [39] also proposed fast
heuristic solutions, where they define density as edge
surplus: the difference between the number of edges and a
fraction $\alpha$ of the maximum possible number of edges,
for a user-specified constant $\alpha > 0$. Likewise, Pei et
al. study detecting quasi-cliques in multi-graphs [?]. Other
definitions include k-cores, k-plexes, and k-clubs, etc. [?].

Dense subgraph discovery is related to finding clusters
in graphs, however with major distinctions. Most importantly,
dense subgraph discovery has to do with absolute density,
where there exists a preset threshold for what is
sufficiently dense. On the other hand, graph clustering is
concerned with relative density measures, where the density
of one region is compared to another. Moreover, not all
clustering objectives are based on density and not all types
of dense subgraphs can be found by clustering algorithms [26].

In summary, while similarities among them exist, discovery
of critical regions, robust subgraphs, cliques, densest
subgraphs, and clusters are substantially distinct graph
mining problems, for which different algorithms can be
applied. To the best of our knowledge, our work is the first
to consider identifying robust local subgraphs in large
graphs.

7 Conclusion 

We introduced the RLS-Problem of finding the most robust
local subgraph of a given size in large graphs, as well
as its three practical variants. While our work bears
similarity to densest subgraph mining, it differs from it in
its objective: robustness emphasizes subgraph topology more
than edge density. We showed that our problem is NP-hard and
that it does not exhibit semi-heredity or subgraph
monotonicity properties. We designed two heuristic algorithms
based on top-down and bottom-up search strategies, and showed
how we can adapt them to address the problem variants.
We found that our bottom-up strategy provides consistently
superior results, scales linearly with input graph size, and
finds subgraphs with significant robustness. Experiments on
synthetic and real graphs showed that our subgraphs are of
higher robustness than densest subgraphs even at lower
densities, which illustrates the novelty of our problem
setting.

Our research sets off several future directions, including
the hardness analysis for the RGS-Problem, exploration of
new robustness measures with desirable properties, and the
design of efficient algorithms for those new objectives.

Appendix

A NP-Hardness Proof of RLS-Problem

The decision version of the RLS-Problem is as follows.

P1. (robust s-subgraph problem RLS-Problem) Is there
a subgraph $S$ in graph $G$ with $|S| = s$ nodes and robustness
$\bar{\lambda}(S) \geq \sigma$, for some $\sigma > 0$?

In order to show that P1 is NP-hard, we reduce from
the NP-complete s-clique problem P2 [19].

P2. (s-clique problem CL) Does graph $G$ contain a clique
of size $s$?

Proof. (of Theorem 3.1)

It is easy to see that P1 is in NP, since given a graph
$G$ we can guess the subgraph with $s$ nodes and compute its
robustness in polynomial time.

In the reduction, we convert the instances as follows. An
instance of CL is a graph $G = (V, E)$ and an integer $s$. We
pass $G$, $s$, and $\sigma = \bar{\lambda}(G_s)$ to RLS-Problem,
where $G_s$ is a clique of size $s$. We show that a yes instance
of CL maps to a yes instance of RLS-Problem, and vice versa.

First assume $G$ is a yes instance of CL, i.e., there exists
a clique of size $s$ in $G$. Clearly the same is also a yes
instance of RLS-Problem, as $\bar{\lambda}(G_s) \geq \sigma$.

Next assume $S$ is a yes instance of RLS-Problem, thus
$\bar{\lambda}(S) \geq \bar{\lambda}(G_s)$. The proof is by
contradiction. Assume $S$ is a subgraph with $s$ nodes that is
not a clique. As such, it has one or more edges missing
relative to $G_s$. Let us denote by $W_k = \mathrm{trace}(A^k)$
the number of closed walks of length $k$ in $G_s$. Deleting an
edge from $G_s$ will certainly not increase $W_k$, and in some
cases (e.g., for $k = 2$) will strictly decrease it. Since
$\sum_i e^{\lambda_i} = \sum_k W_k / k!$, robustness strictly
decreases as well. As such, any s-subgraph $S$ of $G_s$ with
missing edges will have $\bar{\lambda}(S) < \bar{\lambda}(G_s)$,
which contradicts our assumption that $S$ is a yes instance of
the RLS-Problem. Thus, $S$ must be an s-clique and also
a yes instance of CL. □
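
The fact the reduction relies on, namely that among all graphs on $s$ nodes only the clique attains the maximum robustness, can be checked by brute force for a small $s$. The sketch below is an illustration only, assuming robustness is $\ln(\frac{1}{s}\sum_i e^{\lambda_i})$; the function names and the choice $s = 5$ are hypothetical.

```python
import itertools
import math
import numpy as np

def robustness(adj):
    """ln of the average of exp(eigenvalue) over the adjacency spectrum."""
    eigvals = np.linalg.eigvalsh(adj)
    return float(np.log(np.mean(np.exp(eigvals))))

def most_robust_graph(s):
    """Enumerate all labeled simple graphs on s nodes.

    Returns the best robustness found and the edge count of a graph attaining it.
    """
    pairs = list(itertools.combinations(range(s), 2))
    best_value, best_edges = -math.inf, None
    for mask in itertools.product((0, 1), repeat=len(pairs)):
        adj = np.zeros((s, s))
        for (u, v), present in zip(pairs, mask):
            if present:
                adj[u, v] = adj[v, u] = 1.0
        value = robustness(adj)
        if value > best_value + 1e-12:
            best_value, best_edges = value, sum(mask)
    return best_value, best_edges

s = 5
best_value, best_edges = most_robust_graph(s)
# Closed form for the s-clique: eigenvalue s-1 once, eigenvalue -1 with multiplicity s-1.
clique_value = math.log((math.exp(s - 1) + (s - 1) * math.exp(-1)) / s)
```

For $s = 5$ the maximizer has all $s(s-1)/2 = 10$ edges, and its robustness agrees with the closed-form clique value.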

B Properties of Robust Subgraphs

In this section we study two properties of the RLS-Problem,
namely semi-heredity (a.k.a. quasi-inheritance) and subgraph
monotonicity. Analysis shows that our objective formulation
exhibits neither of these properties.

B.1 Semi-heredity: It is easy to identify $\sigma$-robust graphs
(i.e., $\bar{\lambda} = \sigma$) that contain subsets of nodes
that induce subgraphs with robustness less than $\sigma$. As
such, robust subgraphs do not display heredity. Here, we study
a weaker version of heredity called semi-heredity or
quasi-inheritance.

Definition 1. (Semi-heredity) Given any graph $G = (V, E)$
satisfying a property $p$, if there exists some $v \in V$
such that $G - v = G[V \setminus \{v\}]$ also has property $p$,
$p$ is called a semi-hereditary property.


Proof (of Theorem 3.2)

The proof is by counterexample. In particular, robustness
of cliques is not semi-hereditary. Without loss of generality,
let $G_k$ be a k-clique. Then,
$\bar{\lambda}(G_k) = \ln\left(\frac{1}{k}\left(e^{k-1} + (k-1)e^{-1}\right)\right)$.
Any subgraph of $G_k$ with $k - 1$ nodes is also a clique
having strictly lower robustness, for $k > 1$, i.e.,

$$\frac{1}{k-1}\left(e^{k-2} + (k-2)\frac{1}{e}\right) < \frac{1}{k}\left(e^{k-1} + (k-1)\frac{1}{e}\right)$$

$$k e^{k-2} + \frac{k(k-2)}{e} < (k-1)e^{k-1} + \frac{(k-1)^2}{e}$$

$$k e^{k-1} + k^2 - 2k < (k-1)e^{k} + (k^2 - 2k + 1)$$

$$k e^{k-1} < (k-1)e^{k} + 1$$

where the inequality is sharp (holds with equality) only for
$k = 1$. Thus, for $\sigma = \bar{\lambda}(G_k)$, there exists no
$v$ such that $G_k - v$ is at least $\sigma$-robust. □
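
As a quick numerical sanity check of this argument, the closed-form clique robustness can be evaluated for a range of sizes. The snippet is only a sketch: the function name and the range $k = 1, \ldots, 30$ are arbitrary choices made for illustration.

```python
import math

def clique_robustness(k):
    """Closed-form robustness of a k-clique: eigenvalue k-1 once, -1 with multiplicity k-1."""
    return math.log((math.exp(k - 1) + (k - 1) * math.exp(-1)) / k)

values = [clique_robustness(k) for k in range(1, 31)]

# Every (k-1)-clique is strictly less robust than the k-clique that contains it.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

# The final inequality of the proof, k e^{k-1} < (k-1) e^k + 1, checked for k = 2..30.
final_inequality = all(
    k * math.exp(k - 1) < (k - 1) * math.exp(k) + 1 for k in range(2, 31)
)
```

Since removing any node from a clique leaves a smaller clique, the strict increase in $k$ is exactly what rules out semi-heredity.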

B.2 Subgraph monotonicity: As we defined in §2.1,
our robustness measure can be written in terms of subgraph
centrality as $\bar{\lambda}(G) = \log\left(\frac{1}{n} S(G)\right)$.

As $S(G)$ is the total number of weighted closed walks,
$\bar{\lambda}$ is strictly monotonic with respect to edge
additions and deletions. However, monotonicity is not directly
obvious for node modifications due to the $\frac{1}{n}$ factor
in the definition, which changes with the graph size.

Definition 2. (Subgraph monotonicity) An objective function
(in our case, robustness) $R$ is subgraph monotonic if for any
subgraph $g = (V', E')$ of $G = (V, E)$, $V' \subseteq V$ and
$E' \subseteq E$, $R(g) \leq R(G)$.

Proof (of Theorem 3.3)

Assume we start with any graph $G$ with robustness
$\bar{\lambda}(G)$. Next, we want to find a graph $S$ with
robustness at least as large as $\bar{\lambda}(G)$ but that
contains the minimum possible number of nodes $V_{min}$. Such a
smallest subgraph is in fact a clique, with the largest
eigenvalue $(V_{min} - 1)$ and the rest of the $(V_{min} - 1)$
eigenvalues equal to $-1$. To obtain the exact value of
$V_{min}$, we need to solve the following

$$\bar{\lambda}(G) = \log\left(\frac{1}{V_{min}}\left(e^{V_{min}-1} + (V_{min}-1)e^{-1}\right)\right)$$


which, however, is a transcendental equation and is
often solved using numerical methods. To obtain a solution
quickly, we calculate a linear regression over
$(\bar{\lambda}(G), V_{min})$ samples. We show a sample
simulation in Figure 10 for $V_{min}$ from 1 to 100, where the
regression equation is

$$V_{min} = 1.0295 \cdot \bar{\lambda}(G) + 3.2826$$

Figure 10: Relation between $\bar{\lambda}(G)$ and $V_{min}$ (linear fit: $V_{min} = 1.0295 \cdot \bar{\lambda}(G) + 3.2826$, $R^2 = 0.9998$).

Footnote (on the smallest such subgraph being a clique): Any subgraph $g(C)$ of a k-clique $C$ has strictly lower robustness $\bar{\lambda}$. This is true
when $g(C)$ also contains $k$ nodes, due to monotonicity of $S(G)$ to edge removals
(see the Appendix). Any smaller clique has strictly lower robustness; see the proof for
semi-heredity.

Irrespective of how one computes $V_{min}$, we next
construct a new graph $G' = G \cup S'$ in which $G$ is the
original graph with $n$ nodes and $S'$ is a clique of size
$V_{min} + 1$. Let $\bar{\lambda} = \bar{\lambda}(G)$ and
$\bar{\lambda}' = \bar{\lambda}(S')$, and as such,
$\bar{\lambda} < \bar{\lambda}'$. Then, we can write the
robustness of $G'$ as

$$\bar{\lambda}(G') = \ln \frac{n e^{\bar{\lambda}} + (V_{min}+1) e^{\bar{\lambda}'}}{n + V_{min} + 1} < \ln \frac{n e^{\bar{\lambda}'} + (V_{min}+1) e^{\bar{\lambda}'}}{n + V_{min} + 1} = \bar{\lambda}'$$

which shows that $S'$, which is a subgraph of $G'$, has
strictly larger robustness than the graph $G'$ that contains
it. This construction shows that $\bar{\lambda}$ is not
subgraph monotonic. □
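
The construction can be reproduced numerically on a toy instance. The sketch below is illustrative and not part of the paper: it assumes a 30-node cycle as the host graph $G$ and finds $V_{min}$ by direct search rather than by the linear regression; all function names are hypothetical.

```python
import numpy as np

def robustness(adj):
    """ln of the average of exp(eigenvalue) over the adjacency spectrum."""
    eigvals = np.linalg.eigvalsh(adj)
    return float(np.log(np.mean(np.exp(eigvals))))

def clique(k):
    return np.ones((k, k)) - np.eye(k)

def cycle(n):
    adj = np.zeros((n, n))
    idx = np.arange(n)
    adj[idx, (idx + 1) % n] = 1.0
    return np.maximum(adj, adj.T)

def smallest_clique_size(target):
    """Smallest k whose k-clique has robustness >= target (direct search)."""
    k = 1
    while robustness(clique(k)) < target:
        k += 1
    return k

G = cycle(30)                                  # assumed host graph
v_min = smallest_clique_size(robustness(G))
S_prime = clique(v_min + 1)

# G' is the disjoint union of G and S'.
n, c = G.shape[0], S_prime.shape[0]
G_prime = np.block([[G, np.zeros((n, c))],
                    [np.zeros((c, n)), S_prime]])

subgraph_beats_supergraph = robustness(S_prime) > robustness(G_prime)
```

On this instance the clique $S'$ is strictly more robust than the larger graph $G'$ containing it, matching the inequality above.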


C RLS-Problem Variants

In this section, we describe three practical variants of the
original RLS-Problem.

Given that robustness $\bar{\lambda}$ is not subgraph-monotonic,
it is natural to consider the problem of finding the subgraph
with the maximum overall robustness in the graph (without any
restriction on its size). We call this first variant the robust
global subgraph problem or the RGS-Problem. Note that the RGS
is not necessarily the full graph.

Problem 2. (RGS-Problem) Given a graph $G = (V, E)$, find a
subgraph $S^* \subseteq V$ such that

$$f(S^*) = \max_{S \subseteq V} f(S).$$

$S^*$ is referred to as the most robust subgraph.

Another variant involves finding the top $k$ most robust
s-subgraphs in a graph, which we call the kRLS-Problem.

Problem 3. (kRLS-Problem) Given a graph $G = (V, E)$ and two
integers $s$ and $k$, find $k$ subgraphs
$\mathcal{S} = \{S_1^*, \ldots, S_k^*\}$, each of size
$|S_i^*| = s$, $1 \leq i \leq k$, such that

$$f(S_1^*) \geq f(S_2^*) \geq \ldots \geq f(S_k^*) \geq f(S), \quad \forall S \subseteq V, |S| = s.$$

$\mathcal{S}$ is referred to as the top-k most robust s-subgraphs.

In the third and final variant, the goal is to find the
most robust s-subgraph that contains a set of user-given seed
nodes, called the Seeded-RLS-Problem.

Problem 4. (Seeded-RLS-Problem) Given a graph $G = (V, E)$, an
integer $s$, and a set of seed nodes $U$, $|U| \leq s$, find a
subgraph $U \subseteq S^* \subseteq V$ of size $|S^*| = s$ such
that

$$f(S^*) = \max_{U \subseteq S \subseteq V, |S| = s} f(S).$$

$S^*$ is referred to as the most robust seeded s-subgraph.

It is easy to see that when $k = 1$ and $U = \emptyset$, the
kRLS-Problem and the Seeded-RLS-Problem, respectively, reduce
to the RLS-Problem, and thus can easily be shown to be NP-hard
as well. A formal proof of hardness for the RGS-Problem,
however, is nontrivial and remains an interesting open problem
for future research.


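D Updating the Eigen-pairs
D.1 Updating the Eigen-values

Proof (of Lemma 4.1)

Using the relation $A u_j = \lambda_j u_j$ between the
eigenvalues and eigenvectors, we can write the updated relation
as

$$(A + \Delta A)(u_j + \Delta u_j) = (\lambda_j + \Delta\lambda_j)(u_j + \Delta u_j)$$

Expanding the above, we get

$$A u_j + \Delta A u_j + A \Delta u_j + \Delta A \Delta u_j = \lambda_j u_j + \Delta\lambda_j u_j + \lambda_j \Delta u_j + \Delta\lambda_j \Delta u_j \quad (4.8)$$

By concentrating on first-order approximation, we assume that
all high-order perturbation terms are negligible, including
$\Delta A \Delta u_j$ and $\Delta\lambda_j \Delta u_j$. Further,
by using the fact that $A u_j = \lambda_j u_j$ (i.e., canceling
these terms) we obtain

$$\Delta A u_j + A \Delta u_j = \Delta\lambda_j u_j + \lambda_j \Delta u_j \quad (4.9)$$

Next we multiply both sides by $u_j'$. By the symmetry of $A$
we have $u_j' A = \lambda_j u_j'$, so the terms
$u_j' A \Delta u_j$ and $\lambda_j u_j' \Delta u_j$ cancel, and
by the orthonormality of the eigenvectors ($u_j' u_j = 1$) we
are left with $\Delta\lambda_j = u_j' \Delta A u_j$, which is
Equ. (4.4) and concludes the proof.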
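
The first-order eigenvalue update that this proof arrives at, $\Delta\lambda_j \approx u_j' \Delta A u_j$, can be compared against an exact recomputation. The following is a minimal sketch under assumed settings (a random 60-node graph with edge probability 0.5, one deleted edge, and only the top eigenvalue); none of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
upper = np.triu(rng.random((n, n)) < 0.5, k=1).astype(float)
A = upper + upper.T                      # random symmetric 0/1 adjacency matrix

# Perturbation: delete one existing edge (a, b).
a, b = np.argwhere(upper > 0)[0]
dA = np.zeros((n, n))
dA[a, b] = dA[b, a] = -1.0

vals, vecs = np.linalg.eigh(A)           # eigenvalues in ascending order
u_top = vecs[:, -1]                      # eigenvector of the largest eigenvalue

first_order = float(u_top @ dA @ u_top)  # Delta lambda ~= u' dA u
exact = float(np.linalg.eigvalsh(A + dA)[-1] - vals[-1])
```

The approximation is most reliable for well-separated eigenvalues such as the largest one; for tightly clustered eigenvalues the neglected higher-order terms matter more.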


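D.2 Updating the Eigen-vectors

Proof (of the lemma on eigenvector updates)

Using the orthogonality property of the eigenvectors,
we can write the change $\Delta u_j$ of eigenvector $u_j$ as a
linear combination of the original eigenvectors:

$$\Delta u_j = \sum_{i=1}^{n} \alpha_{ij} u_i \quad (4.10)$$

where the $\alpha_{ij}$'s are small constants that we aim to
determine.

Using Equ. (4.10) in Equ. (4.9) we obtain

$$\Delta A u_j + A \sum_{i=1}^{n} \alpha_{ij} u_i = \Delta\lambda_j u_j + \lambda_j \sum_{i=1}^{n} \alpha_{ij} u_i$$

which is equivalent to

$$\Delta A u_j + \sum_{i=1}^{n} \lambda_i \alpha_{ij} u_i = \Delta\lambda_j u_j + \lambda_j \sum_{i=1}^{n} \alpha_{ij} u_i$$

Multiplying both sides of the above by $u_k'$, $k \neq j$, we get

$$u_k' \Delta A u_j + \lambda_k \alpha_{kj} = \lambda_j \alpha_{kj}$$

Therefore,

$$\alpha_{kj} = \frac{u_k' \Delta A u_j}{\lambda_j - \lambda_k} \quad (4.11)$$

for $k \neq j$. To obtain $\alpha_{jj}$ we use the following
derivation.

$$u_j' u_j = 1 \Rightarrow (u_j + \Delta u_j)'(u_j + \Delta u_j) = 1 \Rightarrow 1 + 2 u_j' \Delta u_j + \|\Delta u_j\|^2 = 1$$

After we discard the high-order term and substitute
$\Delta u_j$ with Equ. (4.10), we get
$1 + 2\alpha_{jj} = 1 \Rightarrow \alpha_{jj} = 0$.

We note that for a slightly better approximation, one can
choose not to ignore the high-order term, which is equal to
$\|\Delta u_j\|^2 = \sum_{i=1}^{n} \alpha_{ij}^2$. Thus, one
can compute $\alpha_{jj}$ as

$$1 + 2\alpha_{jj} + \sum_{i=1}^{n} \alpha_{ij}^2 = 1$$

$$\Rightarrow 1 + 2\alpha_{jj} + \alpha_{jj}^2 + \sum_{i=1, i \neq j}^{n} \alpha_{ij}^2 = 1$$

$$\Rightarrow (1 + \alpha_{jj})^2 + \sum_{i=1, i \neq j}^{n} \alpha_{ij}^2 = 1$$

$$\Rightarrow \alpha_{jj} = \sqrt{1 - \sum_{i=1, i \neq j}^{n} \alpha_{ij}^2} - 1$$

Using the $\alpha_{ij}$'s as given by Equ. (4.11) and
$\alpha_{jj} = 0$, we can see that $\Delta u_j$ in Equ. (4.10)
is equal to Equ. (4.5).


E Greedy Top-down Search for RLS-Problem
E.1 GreedyRLS Algorithm. See pseudo-code in Algorithm 4.

Algorithm 4 GreedyRLS

Input: Graph $G = (V, E)$, its adj. matrix $A$, integer $s$
Output: Subset of nodes $S^* \subseteq V$, $|S^*| = s$
 1: Compute top $t$ eigen-pairs $(\lambda_j, u_j)$ of $A$, $1 \leq j \leq t$
 2: $S_n \leftarrow V$, $\bar{\lambda}(S_n) = \bar{\lambda}(G)$
 3: for $z = n$ down to $s + 1$ do
 4:   Select node $i$ out of $\forall i \in S_z$ that maximizes Equ. (4.7) for the top $t$ eigen-pairs of $G[S_z]$, i.e.
        $f = \max_{i \in S_z} \left( c_1 e^{-2 u_{i1} \sum_{v \in \mathcal{N}(i)} u_{v1}} + \ldots + c_t e^{-2 u_{it} \sum_{v \in \mathcal{N}(i)} u_{vt}} \right)$
      where $c_1$ and $c_j$, $2 \leq j \leq t$, are the coefficients from Equ. (4.7)
 5:   $S_{z-1} := S_z \setminus \{i\}$, $\bar{\lambda}(S_{z-1}) = \log(\cdots)$
 6:   Update $A$; $A(:, i) = 0$ and $A(i, :) = 0$
 7:   if $z = \frac{n}{2}, \frac{n}{4}, \frac{n}{8}, \ldots$ then
 8:     Compute top $t$ eigen-pairs $(\lambda_j, u_j)$ of $A$, $1 \leq j \leq t$
 9:   else
10:     Update top $t$ eigenvalues of $A$ by Equ. (4.4)
11:     Update top $t$ eigenvectors of $A$ by Equ. (4.5)
12:   end if
13: end for
14: return $S^* \leftarrow S_{z=s}$

E.2 Complexity analysis. Algorithm 4 has three main
components: (a) computing the top $t$ eigen-pairs (L1):
$O(nt + mt + nt^2)$; (b) computing Equ. (4.7) scores for all
nodes using the top $t$ eigen-pairs (L4): $O(mt)$, since
$\sum_i d_i t = t \sum_i d_i = 2mt$; and (c) updating $t$
eigenvalues: $O(d_i t)$ (using Equ. (4.6)) and updating $t$
eigenvectors: $O(nt^2)$ (using Equ. (4.5)) when a node $i$ is
removed (L10 and L11, respectively).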
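
The first-order eigenvector update given by Equ. (4.10) and (4.11) with $\alpha_{jj} = 0$ can likewise be checked against an exact recomputation. This is a sketch under assumed settings (random 60-node graph, one deleted edge, top eigenvector only); the function name `first_order_eigvec_update` is hypothetical.

```python
import numpy as np

def first_order_eigvec_update(vals, vecs, dA, j):
    """Delta u_j = sum_{k != j} (u_k' dA u_j / (lambda_j - lambda_k)) u_k, with alpha_jj = 0."""
    proj = vecs.T @ (dA @ vecs[:, j])    # u_k' dA u_j for every k
    gaps = vals[j] - vals                # lambda_j - lambda_k
    alpha = np.zeros_like(proj)
    others = np.arange(len(vals)) != j
    alpha[others] = proj[others] / gaps[others]
    return vecs @ alpha

rng = np.random.default_rng(0)
n = 60
upper = np.triu(rng.random((n, n)) < 0.5, k=1).astype(float)
A = upper + upper.T

a, b = np.argwhere(upper > 0)[0]         # delete one existing edge (a, b)
dA = np.zeros((n, n))
dA[a, b] = dA[b, a] = -1.0

vals, vecs = np.linalg.eigh(A)
j = n - 1                                # index of the largest eigenvalue
approx = vecs[:, j] + first_order_eigvec_update(vals, vecs, dA, j)

exact = np.linalg.eigh(A + dA)[1][:, -1]
if exact @ vecs[:, j] < 0:               # resolve the sign ambiguity of eigenvectors
    exact = -exact

err_updated = float(np.linalg.norm(exact - approx))
err_stale = float(np.linalg.norm(exact - vecs[:, j]))
```

The update divides by eigenvalue gaps, so it is only trustworthy for eigenvalues that are well separated from the rest of the spectrum; this is one reason periodic exact recomputation remains useful.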


Step (a) is executed only once, whereas (b) and (c)
are performed at every step, for $n - s$ steps, i.e., $O(n)$
steps in general for small constant $s$. Performing (b) for all
nodes at every iteration thus takes $O(mtn)$. Moreover,
performing (c) iteratively for all nodes requires
$\sum_i d_i t = t \sum_i d_i = 2mt$, i.e., $O(mt)$ for
eigenvalues and $O(t^2 n^2)$ for eigenvectors. Therefore, the
overall complexity becomes $O(\max(tmn, t^2 n^2))$.

As we no longer would have small perturbations to the
adjacency matrix over many iterations, updating the
eigen-pairs at all steps would yield bad approximations. As
such, we recompute the eigen-pairs at steps
$z = \frac{n}{2}, \frac{n}{4}, \frac{n}{8}, \ldots$.
Performing recomputes less frequently in early iterations is
reasonable, as early nodes are likely the peripheral ones that
do not affect the eigen-pairs much, for which updates would
suffice. When perturbations accumulate over iterations, and
especially when we get closer to the solution, it becomes
beneficial to recompute the eigen-pairs more frequently.

In fact, in a greedier version one can drop the eigen-pair
updates (L10-11), so as to only perform $O(\log n)$
recomputes (L8), in which case the complexity becomes
$O(\max(tm \log n, t^2 n \log n))$.
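
To make the top-down search concrete, here is a small sketch. It is a simplification and not the paper's Algorithm 4: it recomputes all eigen-pairs exactly at every step (no incremental updates and no recompute schedule), uses all $t = z$ eigen-pairs, and assumes the weights $c_j = e^{\lambda_j}$ so that a node's score approximates $\sum_j e^{\lambda_j + \Delta\lambda_j}$ after its removal. The function name and the toy host graph are hypothetical.

```python
import numpy as np

def greedy_top_down(adj, s):
    """Repeatedly drop the node whose removal keeps sum_j exp(lambda_j + dlambda_j) largest."""
    nodes = list(range(adj.shape[0]))
    while len(nodes) > s:
        sub = adj[np.ix_(nodes, nodes)]
        vals, vecs = np.linalg.eigh(sub)   # exact eigen-pairs of the current subgraph
        # First-order change of lambda_j when node i is removed:
        #   dlambda_j = -2 * u_ij * sum_{v in N(i)} u_vj
        neighbor_sums = sub @ vecs         # entry (i, j) = sum_{v in N(i)} u_vj
        dvals = -2.0 * vecs * neighbor_sums
        # Assumed weighting c_j = exp(lambda_j); one score per candidate node i.
        scores = np.exp(vals[None, :] + dvals).sum(axis=1)
        nodes.pop(int(np.argmax(scores)))
    return sorted(nodes)

# Toy host graph (assumed): a 5-clique on nodes 0-4 with a path 4-5-6-7-8 attached.
adj = np.zeros((9, 9))
adj[:5, :5] = 1.0 - np.eye(5)
for u, v in [(4, 5), (5, 6), (6, 7), (7, 8)]:
    adj[u, v] = adj[v, u] = 1.0

result = greedy_top_down(adj, 5)
```

On this toy graph the peripheral path nodes are removed first and the 5-clique survives, which mirrors the intuition that early removals are peripheral nodes with little effect on the eigen-pairs.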

E.3 Algorithm Variants. To adapt our GreedyRLS algorithm
for the kRLS-Problem, we can find one subgraph at a time,
remove all its nodes from the graph, and continue until we
find $k$ subgraphs or end up with an empty graph. This way we
generate node-disjoint robust subgraphs. One can also create
edge-disjoint subgraphs if, instead of removing the nodes of
each found subgraph, one removes the edges among the nodes of
the subgraph.

For the Seeded-RLS-Problem, we can add the condition to
never remove nodes $u \in U$ that belong to the seed set.

The GreedyRLS algorithm is particularly suitable for the
RGS-Problem, where we can iterate for $z = n, \ldots, V_{min}$,
record the robustness $\bar{\lambda}(S_z)$ for each subgraph at
each step (Alg. 4, L5), and return the subgraph with the
maximum robustness among all the $S_z$'s.

F Greedy Randomized Adaptive Search Procedure (GRASP) for RLS-Problem

F.1 Complexity analysis. The size of subgraphs $|S|$
obtained during local search is $O(s)$. Computing their top $t$
eigen-pairs takes $O(s^2 t + s t^2)$, where we use
$e([S]) = O(s^2)$ as robust subgraphs are often dense. To find
the best improving node (L12), all nodes in the neighborhood
$\mathcal{N}(S) \setminus S$ are evaluated, with worst-case
size $O(n)$. As such, each expansion costs
$O(n s^2 t + n s t^2)$. With deletions incorporated (L3-4),
the number of expansions can be arbitrarily large [24];
however, assuming $O(s)$ expansions are done, the overall
complexity becomes $O(n s^3 t + n s^2 t^2)$. If all $t = |S|$
eigen-pairs are computed, the complexity is quartic in $s$ and
linear in $n$, which is feasible for small $s$. Otherwise, we
exploit eigen-pair updates as in GreedyRLS to reduce
computation.

F.2 Algorithm Variants. Adapting GRASP-RLS for the
kRLS-Problem can be easily done by returning the best $k$
(out of $T_{max} > k$) distinct subgraphs computed during the
GRASP iterations. These subgraphs are likely to overlap,
although one can incorporate constraints on the extent of
allowed overlap.

For the Seeded-RLS-Problem, we can initialize set $S$ with
the seed nodes $U$ in construction (L1 of the construction
algorithm) while, during the local search phase, we never
discard a node $u \in U$ from $S$ (L4 of the local search
algorithm).

Finally, for the RGS-Problem, we can waive the size
constraint in the expansion step of local search (L10 of the
local search algorithm).

Footnote (on $V_{min}$): $V_{min}$ denotes the minimum number of nodes a clique $C$ with robustness at least
as large as the full graph's robustness would contain. Any subgraph of the clique $C$ has
lower robustness (see the Appendix) and hence would not qualify as the most robust
subgraph.



